home *** CD-ROM | disk | FTP | other *** search
-
- Bzip2 is not research work, in the sense that it doesn't present any
- new ideas. Rather, it's an engineering exercise based on existing
- ideas.
-
- Four documents describe essentially all the ideas behind bzip2:
-
- Michael Burrows and D. J. Wheeler:
- "A block-sorting lossless data compression algorithm"
- 10th May 1994.
- Digital SRC Research Report 124.
- ftp://ftp.digital.com/pub/DEC/SRC/research-reports/SRC-124.ps.gz
-
- Daniel S. Hirschberg and Debra A. LeLewer
- "Efficient Decoding of Prefix Codes"
- Communications of the ACM, April 1990, Vol 33, Number 4.
- You might be able to get an electronic copy of this
- from the ACM Digital Library.
-
- David J. Wheeler
- Program bred3.c and accompanying document bred3.ps.
- This contains the idea behind the multi-table Huffman
- coding scheme.
- ftp://ftp.cl.cam.ac.uk/pub/user/djw3/
-
- Jon L. Bentley and Robert Sedgewick
- "Fast Algorithms for Sorting and Searching Strings"
- Available from Sedgewick's web page,
- www.cs.princeton.edu/~rs
-
- The following paper gives valuable additional insights into the
- algorithm, but is not immediately the basis of any code
- used in bzip2.
-
- Peter Fenwick:
- Block Sorting Text Compression
- Proceedings of the 19th Australasian Computer Science Conference,
- Melbourne, Australia. Jan 31 - Feb 2, 1996.
- ftp://ftp.cs.auckland.ac.nz/pub/peter-f/ACSC96paper.ps
-
- All three are well written, and make fascinating reading. If you want
- to modify bzip2 in any non-trivial way, I strongly suggest you obtain,
- read and understand these papers.
-
- I am much indebted to the various authors for their help, support and
- advice.
-
-